Near real-time streaming analysis of big fusion data
Abstract
Experiments on fusion plasmas produce high-dimensional data time series with ever increasing magnitude and velocity, but turn-around times for analysis of this data have not kept up. For example, many data analysis tasks are often performed in a manual, ad-hoc manner some time after an experiment. In this article we introduce the DELTA framework, which facilitates near real-time streaming analysis of big and fast fusion data. By streaming measurement data from fusion experiments to a high-performance compute center, DELTA allows computationally expensive data analysis tasks to be performed between plasma pulses. This article describes the modular and expandable software architecture of DELTA and presents performance benchmarks of individual components as well as of example workflows. Focusing on a streaming analysis workflow where electron cyclotron emission imaging (ECEi) data measured at KSTAR is streamed to NERSC's supercomputer, we routinely observe data transfer rates of about 4 Gigabit per second. At NERSC, a demanding turbulence analysis workflow effectively utilizes multiple nodes and graphical processing units and executes in under 5 minutes. We further discuss how DELTA uses modern database systems and container orchestration services to provide web-based real-time data visualization. For the case of ECEi data, we demonstrate how data visualizations can be augmented with outputs from machine learning models. By providing session leaders and physics operators with results of higher order data analysis using live visualizations, they may make more informed decisions on how to configure the machine for the next shot.
Similar resources
BIDCEP: A Vision of Big Data Complex Event Processing for Near Real Time Data Streaming
This position paper aims to trigger a technical discussion by proposing a conceptual architecture for big data streaming integrated with complex event processing (BiDCEP). BiDCEP expands the Lambda and Kappa (LK) architectures for big data streaming to fit the complex event processing (CEP) and event management domains of enterprise IT. BiDCEP links CEP components as defined in previous work of...
Beyond Batch Processing: Towards Real-Time and Streaming Big Data
Today, big data is generated from many sources and there is a huge demand for storing, managing, processing, and querying big data. The MapReduce model and its open source implementation, Hadoop, have proven themselves as the de facto solution to big data processing. Hadoop is inherently designed for batch and high throughput processing jobs. Although Hadoop is very suitable for batch ...
Near-Real-Time OGC Catalogue Service for Geoscience Big Data
Geoscience data are typically big data, and they are distributed in various agencies and individuals worldwide. Efficient data sharing and interoperability are important for managing and applying geoscience data. The OGC (Open Geospatial Consortium) Catalogue Service for the Web (CSW) is an open interoperability standard for supporting the discovery of geospatial data. In the past, regular OGC ...
Fuzzy Data Envelopment Analysis for Classification of Streaming Data
The classification of fuzzy uncertain data is considered one of the most challenging issues in data analysis. In spite of the significance of fuzzy data in mathematical programming, the development of the analytical methods of fuzzy data is slow. Therefore, the current study proposes a new fuzzy data classification method based on fuzzy data envelopment analysis (DEA) which can handle strea...
Towards Near Real-Time BGP Deep Analysis: A Big-Data Approach
BGP (Border Gateway Protocol) serves as the primary routing protocol for the Internet, enabling Autonomous Systems (individual network operators) to exchange network reachability information. Alongside significant on-going research and development efforts, there is a practical need to understand the nature of events that occur on the Internet. Network operators are acutely aware of security-rel...
Journal
Journal title: Plasma Physics and Controlled Fusion
Year: 2021
ISSN: 1361-6587, 0741-3335
DOI: https://doi.org/10.1088/1361-6587/ac3f42